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A  Parallel  Divide  and  Conquer 

Algorithm  for  the  Generalized  Real 

Symmetric  Definite  Tridiagonal 

Eigenproblem 

Carlos  F.  Borges*and  William  B.  Gragg^ 


Abstract. 

We  develop  a  parallel  divide  and  conquer  algorithm,  by  extension,  for  the  generalized 
real  symmetric  definite  tridiagonal  eigenproblem.  The  algorithm  employs  techniques 
first  proposed  by  Gu  and  Eisenstat  to  prevent  loss  of  orthogonality  in  the  computed 
eigenvectors  for  the  modification  algorithm.  We  examine  numerical  stability  and  adapt 
the  insightful  error  analysis  of  Gu  and  Eisenstat  to  the  arrow  case.  The  algorithm 
incorporates  an  elegant  zero  finder  with  global  monotone  cubic  convergence  that  has 
performed  well  in  numerical  experiments.  A  complete  set  of  tested  matlab  routines 
implementing  the  algorithm  is  available  on  request  from  the  authors. 


1    Introduction 

We  consider  the  problem  of  finding  a  matrix  U  G  3inxn  such  that 

UT(T-SX)U  =  A- IX, 
is  diagonal,  or  equivalently 

UTSU  =  I      and      UTTU  =  A,  (1) 

where 
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and  6'  is  assumed  to  be  positive  definite.  This  generalized  eigenvalue  problem  has 
two  special  cases  that  are  of  interest  in  themselves.  They  are: 

1.  S  =  I,  the  ordinary  tridiagonal  eigenproblem. 

2.  S  =  I  and  ak  =  0,  the  bidiagonal  singular  value  problem  (bsvp),  by  perfect 
shuffle  of  the  Jordan  matrix 


0      BT  ^ 
B       0 


with  B  upper  bidiagonal  [16]. 

There  are  two  phases  to  the  divide  and  conquer  algorithm,  the  divide  (or  split) 
phase,  and  the  conquer  (or  consolidate)  phase.  We  shall  address  these  in  order. 


2    The  algorithm 

2.1    The  divide  phase 

Denote  by  e,  the  ith  axis  vector  where  the  dimension  will  be  clear  from  the  context. 
Let  s,  1  <  s  <  n,  be  an  integer,  the  split  index,  and  consider  the  following  block 
forms: 


T     = 


S     = 


ft-iejlj           a,  0sej 

eift  T2 

Si  ea-i7t-i 

T                  a  T 

ei7a  S2 


Note  that  s  =  n  is  possible;  then  T2,  S2,  and  ej  are  empty  [9,  10].   Suppose  we 
solve  the  subproblems 


Ul  (7]t  -  SkX)  Uk  =Ak-IX        (k  =  1,2). 


(2) 
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The  form  of  the  subproblems  is  preserved.     In  particular,  the  matrices  Sk   are 
positive  definite  and,  if  T  has  a  zero  diagonal,  so  do  the  matrices  7*.  Let 


U  = 


Ux 


Ui 


Then 


UT  (T  -SX)U  = 


U?{T!  -  SiA)!/!  UTe,-^,-!  -  7,-iA) 

(/?,_!  -  Js.1X)eJ_1U1  as  -  SSX  (ft  -  1$X)eJU'2 

U2Te1(ps-7sX)  U?(T2-S2\)U2 


2.2    The  conquer  phase 

The  conquer  phase  consists  of  solving  the  subproblems  (2)  from  the  divide  phase, 
consolidating  the  solutions,  and  finally,  solving  the  consolidated  problem.  Let 

ui  =  UJb„-x,         u2  =  Ujei, 
where  the  Uk  are  solutions  to  (2).  Then 


UT  (T  -  SX)  U 


Ai  -  IX  ui(/?,_i  -7,_iA) 

(ft_i  -  ys-iX)uJ  as  -  6SX  (j33  -  7,A)uf 

u2(^-7»A)  A2-/A 

The  right  side  is  the  sum  of  a  diagonal  and  a  Swiss  cross: 


UT(T-SX)U  = 


+ 


x 

X 

X       X       X       X       X 

X 

X 


This  can  be  permuted  to  an  arrow  matrix  by  a  permutation  similarity  transfor- 
mation with  Ps  =  [ei,e2)  ...,es_i,e5+i,  ...,e„,ea].  Thus 


A(X)     := 


PjUT  (T-  SX)UPS 

Ai  "lft-i 

A2         u2ps 


T  T 

L  T.--i"i      7»"2 


"1 7* -l 

"27a 


D       Bu 
uTB      a 


I       Cm 

uTC       7 


with 
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B  = 


Ps-ll 


P.I 


"2 
C  = 


ls-rl 


Is  I 


Since  S  and 


/        Cu 
uTC       t 
from  the  former.  Its  Cholesky  decomposition  is 


are  congruent  the  latter  inherits  positive  definiteness 


/        Cu 
uTC       7 


I 

uTC    p 


I     Cu 

p 


R1  R, 


with  p    =  7  —  u    C  u  >  0  the  Schur  complement  in  S  of 


Now 


R-L  = 


-Cxx/p 

Up 


and  a  second  congruence  transformation  with  R   1  gives 


.4(A) 


R-TA(X)R-1 


=      R- 


D       Bu 
uTB       a 


II 


-l 


D      w 

T 
W  L) 


IX 


-IX 
A -XI 


with 


(B-DC)u 

w     =     , 

P 

as  -  xxT  (2B  -  £»C)  Cu 

U     =     ? • 

We  have  reduced  the  conquer  step  to  the  problem  of  solving  an  ordinary  eigen- 
problem  for  a  symmetric  arrow  matrix.  If  V  is  an  orthogonal  matrix  with 

AV  =  VA 
and  A  diagonal,  then  (1)  holds  with 

U     =     UPsR~lV 


Ui  -Uiui-rs-i/p 

l/P 

C/2       -U2U2ys/p 


V. 
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It  is  useful  that  v*  =  UkUk  can  be  computed  in  O(n)  time  by  solving  S\\\  = 
e3_i  and  52v2  =  ei  using  the  Cholesky  factorization  Si  =  L\Lj  and  the  reverse 
Cholesky  factorization  52  =  LTL-j-  In  the  case  that  only  the  eigenvalues  are 
wanted  it  is  only  necessary  to  compute  the  first  and  last  rows  of  the  [/-matrices 
which  constitutes  a  further  savings. 

In  summary,  the  conquer  phase  proceeds  by  consolidating  the  subproblems  and 
building  a  full  eigenproblem  for  an  arrow  matrix. 


3    Solving  the  eigenproblem  for  the  arrow 

In  this  section  we  consider  the  solution  of  the  eigenproblem  for  a  real  symmetric 
arrow  matrix 


A 


D      b 
bT     7 


where  A  £  ?RnXn  is  symmetric,  D  =  diag(a),  a  =  [ai on-i]    i  ai  >  a2  >  •••  > 

an_i,  and  b  =  [/?i ,  ...,/3„_i]T  >  0.  When  A  arises  from  the  BSVD  then  a  is  odd 
and  b  is  even,  that  is  a  -f  Ja  =  0  and  b  =  7b,  with  J  the  counter-identity,  the 
identity  matrix  with  its  columns  reversed,  and  7  =  0. 

If  any  j3j  =  0  then  it  is  possible  to  set  A;  =  act  and  deflate  the  matrix  since 
ej  is  clearly  an  eigenvector  [28].  We  shall  call  this  /^-deflation  and  note  that  if 
(3j  <  tolp\\b\\  where  tolp  is  a  small  tolerance  then  a  numerical  deflation  occurs. 
We  derive  a  precise  value  for  tolp  in  section  4.4. 

A  second  type  of  deflation  occurs  if  applying  a  2  x  2  rotation  similarity  trans- 
formation in  the  (j,j  +  l)-plane  that  takes  j3j  to  zero  introduces  a  sufficiently  small 
element  in  the  (j,j+l)  position  of  the  matrix.  This  will  be  called  a  comio-deflation 
(see  [15]).  At  each  consolidation  step  we  perform  a  sweep  to  check  for  /^-deflations 
followed  by  a  sweep  to  check  for  com io-deflations.  The  co7/i6o-deflation  can  be 
arranged  so  that  the  ordering  of  the  o^  is  preserved  whenever  one  occurs.   After 

deflation  the  new  /?*+1   :=  .//??  +  /3j+l  >  /?j  +  1  and  hence  no  further  /^-deflation 

can  occur.  The  com&o-deflations  can  be  disposed  of  with  a  single  pass  by  backing 
up  a  single  element  whenever  one  occurs.  Note  that  deflation  is  backward  stable 
since  it  results  in  small  backward  errors  in  A.  Deflation  for  the  BSVD  is  more 
delicate  involving  a  simultaneous  sweep  from  both  ends  of  the  matrix.  Care  must 
be  exercised  at  the  center  of  the  matrix. 

After  deflation  the  resulting  matrix  can  be  taken  to  have  all  0j  >  0  and  the 
elements  of  the  arrow  shaft  distinct  and  ordered,  that  is  c*i  >  c*2  >  ■■■  >  an-i- 
An  arrow  matrix  of  this  form  will  be  called  ordered  and  reduced.  Henceforth,  we 
shall  assume  A  is  of  this  form. 

The  block  Gauss  factorization  of  A  —  XI  is 
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A  -XI 


I  0 

XI)-1     1  _ 

where  /,  the  spectral  function  of  A,  is  given  by 


bT(D 


D-  XI 
0T 


b 
-/(A) 


n-l 


/(A)  =  A 


01 


7  +  E-^-T- 


This  is  a  rational  Pick  function  with  a  pole  at  infinity  [1].  The  most  general  form 
of  a  rational  Pick  function  is 


/(A)  =  SX  -7  + 


n-l 


"J 


6  >0. 


(3) 


In  relation  to  the  various  divide  and  conquer  schemes,  the  case  6  >  0  corresponds 
witli  extension,  6  =  0  with  modification,  and  6  =  7  =  0  with  restriction  [7]. 

Inspection  of  the  graph  of  the  spectral  function  reveals  that  the  elements  of 
the  shaft  interlace  the  eigenvalues 


A]  >  ci i  >  A o  >  ...  >  ft„_i  >  A„. 


(4) 


Moreover,  in  the  present  case,  the  derivative  of  the  spectral  function  is  bounded 
below  by  one  so  that  its  zeros  are,  in  a  certain  sense,  well  determined. 


3.1    The  zero  finder 

The  fundamental  problem  in  finding  the  eigenvalues  of  an  arrow  is  that  of  providing 
a  stable  and  efficient  method  for  finding  the  zeros  of  the  spectral  function.  We 
now  examine  this  problem  in  some  detail. 

The  zero  finding  algorithm  we  present  is  globally  convergent  in  the  sense  that 
the  iteration  will  converge  to  the  unique  zero  of/  in  (ajt,QfJb-i)  from  any  starting 
value  in  the  closed  interval  [orjb,  tVfc_i],  where  we  put  c*o  =  +oo  and  an  =  — oo. 
The  zero  finder  converges  monotonically  at  a  cubic  rate  and  applies  to  a  general 
Pick  function  as  given  in  formula  (3). 


3.2    Interior  intervals 


The  iterative  procedure  for  finding  the  unique  zero  of  /  in  one  of  the  interior 
intervals  (at,n^_i)  proceeds  as  follows.  Let  xq,  a*.  <  xq  <  ctk-i,  be  an  initial 
approximation  to  A^ .  If  Xj  is  known  choose 


<t>j{x)  =  a  + 


CKk-l  ~  X 


+ 


Ui 


"fc 


so  that 
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<j>f\xj)  =  fW(xj),     t  =  0,1,2. 
Thus  a,  Wq,  and  u\  must,  satisfy 


1     (<**_!- z,)-!     (afc-z/ 

0       (ajfc_j   -  3'j)--        ((*£  -  !•_,) 

0     (arfc_i  -  z,)""3     (afc-Zj) 


-l 


a 

/(«i) 

w0 

= 

/'(*■;') 

Wl 

L  /"(*;) 

and  we  find 


a     =     .ix.j  -  (f  +  dk-i  +  tn-)  +      > 

^-^      a,  —  Xi     Oii  —  Xi     ct{  —  Xi 


(5) 


Since  wo  >  0  and  wj  >  0  it  follows  that  <j>j  is  a  Pick  function.  Thus  <j>j  has  a 
unique  zero  xJ  +  1  6  (<*fc)«*fc-i)-  Also 


Wq 


>  0I-1  >o   ,      w]  >/?|  >  o. 


The  error  function 


/(x)  -  c^(x)  =  x  -  (7  +  a)  + 


0? 


^-1  ~  ^u        /'if!  -  Wi 


otj  -  x        ajt_i  -  x 


ak  -  x 


has  7i  zeros,  counting  multiplicities.  There  are  n— 3  zeros  exterior  to  (o^.,  cu--i)  and 
three  more  at  xj.  Thus  the  error  function  crosses  zero  exactly  once  in  the  interval 
(a<.,a^_i).  Hence  SEj+i  lies  between  Xj  and  A/v.,  and  the  iteration  is  monotonically 
convergent  from  any  starting  guess  Xq  €  [<*Jt,ai-l]  as  claimed.  The  cubic  rate  of 
convergence  follows  from  (5). 

Successive  iterates  can  be  found  by  solving  quadratic  equations.  Rather  than 
solve  <f>j(x)  —  0  for  Xj+\  it  is  better  to  solve 

^■(xi-A)  =  0 

for  the  increment  A  =  Xj  —  Xj+\.  Some  rearrangement  using  (5)  reduces  this  to 

cvA2  + /?A  -  /  =  0,  (6) 

with 
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(ofc_i  -  Xj){xj  -  at)' 


\Ock-l  ~  Xj  Offc  -  Xj  J 

When  shifts  of  the  origin  to  the  nearest  pole  [15]  are  used  then  one  of  at_i  or  at 
is  zero.  The  computation  of  0  —  0(xj)  should  account  for  the  fact  that  it  has  only 
simple  poles  at  at-i  and  a*. 

If  we  start  at  the  midpoint  of  the  interval,  xq  =  (o.t-i  +  c*t)/2,  then  we  always 
have  0  =  /3(Xj)  >  f'(xj)  >  1.  This  can  be  seen  by  noting  that  0(xo)  =  f'(x0) 
and  that  when  xy  >  At  then  for  all  of  the  succeeding  iterates  f(xj)  >  0,  by 
monotonicity,  and  1_j. — h  ]_x  is  negative.  If  xq  <  At  a  similar  argument 
applies.  It  follows  that  the  increment  can  always  be  computed  stably  as 

a=    m    .  (9) 


3.3    Exterior  intervals 

The  treatment  of  the  two  exterior  intervals  is  geometrically  the  same  as  above. 
Again,  the  approximating  function  has  poles  at  the  endpoints  and  the  residues  at 
these  poles,  and  the  constant  term,  are  chosen  to  satisfy  (5).  We  present  the  case 
for  the  interval  (n^oo),  the  case  for  the  other  exterior  interval  being  similar.  Now 


<j)j(x)  -  u0x  -  a  + 


"i 


with 


X—  Pi         a,  -  cti 


u>o     =      1  +  >      .      f  '      v< >       1, 


The  inequalities  are  strict  unless  n  =  '2.  Again  we  find  (6)  where  now 

"•"  L>i=2    (r,-o,)-'  Xj-Oj 


0       =       /'(*>)  + 


Xj   -  a  1 


Xj  -  «! 

These  are  limiting  cases  of  (7)  and  (8)  (  introduce  another  pole  c*o  >  <*i  and  let 
fto  — *  +oo).  If  Xo  >  Ai  then  /(A)  >  0  so  0  >  f  >  1  and  A  is  again  computed 
stably  using  (9).  We  obtain  global  monotone  cubic  convergence  as  before. 
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Contrary  to  the  algorithms  of  [11,  12,  10]  our  algorithm  is  well-defined  when 
starting  at  the  endpoints  of  the  intervals.  The  algorithm  of  [23]  can  start  at  the 
endpoints  but  has  only  quadratic  convergence. 

To  guarantee  that  xo  >  X\  we  take  Xq  to  be  the  iterate  in  (oi,+oo)  from  +oo. 
As  xq  — ►  +oo  the  approximate  Pick  function  tends  to 


4>(x)  =  x  -  7  + 


1 1  >  1 1  - 


«i  —  x 
Our  actual  starting  guess  is  the  zero  of  (10)  in  (cvi,-foo): 


3'0  =    S 


ai  +  ^  +  Vv2^)    +llbH2     •      '>>«!■■ 
llbll2 


«i  + 


^^a/C^P")  +l|b| 
When  shifts  are  used  we  have  cvj  =  0. 


i      7  <  »!■ 


(10) 


3.4    Orthogonality  of  the  eigenvectors 

It  is  essential  that  the  computed  eigenvectors  of  the  arrow  matrix  be  numerically 
orthogonal.  As  a  point  of  entry  into  the  further  analysis  of  the  algorithm  we  now 
examine  the  orthogonality  of  the  eigenvectors  following  [15]. 
Consider  the  divided  difference 


/(A,/. 


/(A)  -/(//) 
A-,, 


n -l 


^  (a, 


fl 


A)(a;-/0 


=     1  +  hJ(D-  Xiy'iD-  itiy'b. 


Note  that  \i  =  X  gives  /'(A)  =  1  +  ||(D-  A 7 ) ~ J  l>] jr .  If /(A)  =  0  then 

v(X)  = 


X-Q 

1 


is  an  eigenvector  of  the  arrow  matrix  A 
value  A,  and 

u(X)  = 


D     I) 


(n; 


associated  with  the  eigen- 


v(X) 


Jnv 


is  the  normalized  eigenvector  whose  last  element  is  positive.  The  ordering  of  A 
implies  that  its  matrix  of  eigenvectors  can  be  taken  positive  below  and  on  the 
diagonal,  and  negative  above. 
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Let  /(Ao)  =  /(/'o)  =  0  with  A,,  /  /t0.  Thus  A0  and  /io  are  distinct  eigenvalues 
of  A.  The  eigenvectors  u(\q)  and  u(//o)  are  orthonormal: 

u{\o)Tu{l'o)  =  /(A0,/'o)  =  0. 
Let  A  and  fi  be  approximate  eigenvalues  in  the  sense  that 

'       07 -A0'  '  ;l  -  1  +  6' 

(12) 

-*'  =  il^L,         |«|<-L, 

J       or,-  -//„  '  Jl  -  1  +  6 

Here  6  >  0  is  hopefully,  but  not  necessarily,  close  to  the  machine  unit  c.  Note  that 
(12)  is  equivalent  with 


-A  (\ ,  —  it  „, 

—  =  1  +  <5, ,      — — -  =  l  +  6't. 


These  conditions  imply  that  the  approximate  eigenvectors  u(A)  and  u(i-i)  are 
nearly  orthogonal.  For  we  have 

vTWW^A)7  «(/0     =     /(A,//)-/(A0)/io) 


jTi  ("j  -  AH°j  -  /')  V         (aj  -  Ao)(o,-  -  /io) 

=   T - to +  3  + MO- 

Since 

-/       -  -,,         26  6'-' 

6,  +  K  +  M      <   :  +  —  <  26 

1  3       ]       J  3l  -  1  +  6       (1  +  6)?  - 
then 

n//'(A)/'(//)</(A)7' ,/(//)  =  2oTr(D  -  A/)-'e(D  -  /i/)"^ 
with  |9|  <  /.  Thus 

N//'(A)/'(//)|I/(A)7'»(//)|  =  2/i||(/-;-A/)-1l..||2||(D-///)-1b||2l 
and  so 

|«(A)T«(/i)|  <  26. 

Condition  (12)  is  stringent.  If  we  let  ;4  —  0  then  it  is  easy  to  show  that  A 
can  have  an  eigenvalue  Ay  =  Ao(/4)  —  (w.  +  0(/?jji);  (12)  then  requires  that  the 
approximate  eigenvalue  A  satisfies  a  bound 
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|A-  Ao|  <0(6$), 

which  is  difficult  if  A/||l>||  is  only  somewhat  larger  than  machine  precision,  say 
e3'4.  Two  techniques  are  used  to  attempt  to  satisfy  (12)  -  shifts  of  the  origin  [15], 
and  simulated  extended  precision  (sep)  arithmetic  [26,  14].  Condition  (12)  means 
that 

|A  -  A0|  <  6min{A0  -  ctktak-i  -  A0). 
When  shifts  are  used  it  means  that  A  is  nearly  //(Aq). 


4    Numerical  stability  of  the  algorithm 

We  now  give  a  partial  analysis  of  the  stability  of  this  approach  to  the  eigenproblem 
for  the  symmetric  arrow  matrix.  Observe  that 

/(a)  =  p{A)  -  n;=.(A-Aj) 


-7(A)    n;:/(A-a;) 

The  following  inverse  eigenvalue  problem  [0]  is  important:  given  {cvj}  and  {Aj} 
satisfying  (4),  find  {/3j }  ami  -7  so  that  A(.-l)  =  {Aj}.  This  problem  is  simply  solved 
by  computing  the  residues  of  the  partial  fraction  decomposition  of/.  In  particular 

a-..,  v(A) 

n;=.("*-\) 

n  n-l 

j = 1         ./  =  1 

For  fixed  {«_/},  the  elements  of  the  arrow  head,  {/?/.)  and  y,  are  explicitly  known 
functions  of  the  eigenvalues. 

Now  let  {Aj}  be  a  set  of  approximate  eigenvalues  of  A  satisfying  (4).  Then 

ft    =    -";=''"'"Aj!       (A>0),  (.3) 

n  n-1 

define  a  modified  matrix  A  with  A(.4)  =  {Aj  j.  To  obtain  a  backward  error  analysis 
for  the  complete  eigen?>a/«f  problem  we  bound  the  differences  J3k  —  /3k  and  7  —  7. 
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4.1    Error  analysis  for  the  Dongarra-Sorensen  condition 
We  give  an  error  analysis  using  the  Dongarra-Sorensen  condition 


-^-T-  =  ^<     IMS  *• 

where  6  =  0(f)  is  of  the  order  of  the  machine  unit,  simplifying  that  in  [6]. 
Rearrangement  of  (15)  gives 


(15) 


It  follows  that 


Aj  -  ft*  =  (Aj  -  (i/..  )(1  +  fy,*)- 


i  =  i  \        j '  =  i 


A  =  A    l  +  jE% 


J=l 


with  the  <5'  fc  and  6" k.  at  most  only  slightly  larger  than  the  6,-*.  Thus 


11  -i 
c  -6' 

—  •> 


where  6"  =  0(e)  is  only  slightly  larger  than  di. 
Now  (1-1)  becomes 

'I   =  ")  +H(AJ  -  akU))6i,kU) 

j  =  l 

with  »i(j)  one  of  the  poles  of/.  Thus 

h  -7i  <  *^2\xj  ~aku)\- 

To  minimize  this  bound  we  choose  i\i:[J)  to  be  a  pole  of/  closest  to  Aj.  Clearly, 
ajk(l)  =  "i  and  a*.,,,,  =  a„_i,  so 

/ 
|7-7|  <  6  I  (Aj  -oj)  +  22  \Xj  -  ak(j)\  +  (a»-i  "  A»< 

For  1  <  j  <  ?i  a  closest  pole  to  X}  is  either  ftj  or  ftj_i.  The  distance 

Iaj  _  n'.(J)l  =  min  {\j  -  n-j,ftj_]  -  A;) 
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is  maximized  when  A;  is  the  midpoint  of  the  interval  (oj,aj_i),  and  the  value  of 
the  maximum  is  (a;  +  Oj_i)/2.  Thus 


It  —  tI    <    M  (Ai  -  (|i)  +  -  5Z('»j-i  -  "j)  +  (£>».-i  -  An) 

<     6(Ai  -An)  <  26||y4||2. 

In  summary,  the  Dongarra-Sorensen  condition  implies  small  relative  errors  in 
each  pk  and  a  small  absolute  error  in  - .  For  the  BSVD  this  implies  small  element- 
wise  relative  errors  since  the  condition  -  =  -  =  0  is  enforced  by  Xj  +  \n+i-j  =  0 
(only  half  of  the  eigenvalues  are  actually  computed,  the  rest  follow  from  this  con- 
dition). 


4.2    Rounding  error  analysis  of  the  computation  of  /(A) 

The  choice  of  a  termination  criterion  depi  nds  on  a  careful  rounding  error  anal- 
ysis of  the  particular  manner  in  which  we  compute  /(A).  Let  {«,},  {A},  and  7 
be  floating  point  numbers.  We  represent  A  as  the  ordered  pair  of  floating  point 
numbers  (u,fi)  where  the  shift  a  is  a  pole  closest  to  A,  and  A  :=  a  +  //.  For  the 
exterior  intervals  we  have  a  =  c\\  or  a  =  o„_i.  For  the  interior  intervals  a  can  be 
determined  by  evaluating  /  at  the  midpoint  and  checking  the  sign.  We  compute 
/(A)  as 

u  -  1  p 

fAv)  =  Y,-TJ—  +  (/<-7'), 
with  the  standard  operation  precedence  rules,  where 

0    =  c\j  —  cr      ami       */    —  '  —  cr. 

We  use  Wilkinson's  notation:  fl(x  *  y)  -  (j:  *  //)(1  +  8)  with  |(^'|  <  f/(l  -I-  f) 
and  e  =  2-'  the  machine  unit.  More  generally,  (  denotes  numbers  not  essentially 
larger  than  2-<  [27]  and  the  rounding  errors  6  satisfy  \8\  <  c. 

We  define 

fl(aj  -A)  :=//(a5  -//)  =  //((..J  -a)-//). 
If  a  =  at  then 

//(a*  -  A)  =  -//  =  n-k  -  A, 
with  no  rounding  error.  For  j  /  *•, 
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fl(ttj  -  A)  =  (aj  -  A)  I  i  +  Hi—2±6  +  6'), 


and  since  cvjt  is  a  pole  closest  to  A  then 
computed  with  small  relative  errors: 


'I,'"!"/     <  2.   Thus  all  terms  a;  —  A  are 


//(ftj  -A)  =  («j  -A)(l+3^),      1^1  <c. 


(16) 


When  computing  /(A)  =  /„(//)  \vc  add  the  term  A  —  -)  =  (A  —  a)  —  (7  —  a)  last. 
A  routine  error  analysis  using  (10)  and 

"  -  1         rj2 

|A-7l<l/(A)|  +  ^1— ^t 
to  eliminate  the  term  |A  —  -  |  from  the  error  hound  gives 


"  - !         l# 


l//(/(A))  -  /(A)|  <  c     :{|/(A)|  +  \a  -  \{+  (n  +  5)  V  r 


# 


;=i 


whicli  implies 


«-i       ^  2 


l//(/(A))|  <  (1  +  :i<  )|/(A)i  +  «  I  |ff  -  A|  +  (»  +  5)  ]T       ^       I  .        (17) 


7=1 


4.3    Termination 

Our  goal  is  to  choose  a  terminal  ion  criterion  so  that  we  stop  when  A  is  as  close 
to  the  true  eigenvalue  A/,,  as  possibl.  Let  //  =  Xk  in  (11)  with  /(A*)  =  0.  Now 
fU-  <  A/.-  <  a/.--!-  Also  lei  n  1,  -  A  ■  '!/.-!•  Then  the  terms  a;  —  A  and  c*j  —  Ajt 
have  the  same  sign  and 


A -A,  I  < 


/(A)| 


+  Ej  =  l     |„,-A|K-At| 


(18) 


To  obtain  an  upper  hound  for  |A  -  A/,. |  we  need  an  upper  bound  for  |/(A)|  and  a 
lower  bound  fur  the  denominator.  For  the  latter  we  have 


n-  1 


1      n 


1  + 


^ 


1  + 


;  =  i 


J  "M"j 


maxj  |cvj  -  A*  I 


(19) 


Let  us  determine  how  small  |/(A)|  is  when  A  is  the  rounded  representation  of 
A^. .  This  is 
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A      :=     ff  +  fl(nk)  =  <r  +  iik[l+6) 
=     Afc  +  nk6  -  At  +  (Ajt  -  <r)6 
and  we  have 

|<7  —  A/;|  =  inin  \c\j  —  \ii\. 

j 

Thus 

f'£' uT x 

J  =  l  («j  -  A)(cij  -  A/,) 
=      |(cr  -  A;,  )'v    I  1  +  S 


-A,|  +  V— ^ 
^    u,  -A 


<       (    I    Iff 

From  (17), 


n  -  1  ?2 

l/'(/(A))|<<  I  ltr-A,.|+|A-fr|  +  (»  +  ())V         J 


Since  A*  -  a  =  (A  -  <r)/(l  +  (*>)  il 


IA/(A))|<<     2|A-<r|  +  (n  +  6)]T 


9 


We  terminate  and  set  A/,  :=  A  when 


n  -  I  yj 

Al 


|//(/(A))|  <  2c     2|A  -  ct\  +  (?j  +  6)  >     t— i- 
Inequality  (17)  also  holds  if  /(A)  and  //(/(A))  are  interchanged.  Thus 


|/(Afc)|<<  [5|AA.-(7|  +  (:^+17)V— ^-r-1  .  (20) 

From  (18)  and  (19) 

.•)k-A,i  +  (.H,+  i7)E;:11-4- 

lA*  -  A/.  I  <  (  max  | A/,.  -  cij  | —, . 
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Since  \a  —  At |  <  \cr  —  A/.. |  +  |A,,  -A/,. |  and  \a  —  Ajt |  <  maxj  |u;— At|  the  computed 
eigenvalues  satisfy 


| Ay.  —  A/.  |  (3?i  +  17)(  max  \c\j  —  At 

j 

<     G(7)  +  6)(||/l||2. 


(21) 


4.4    Error  analysis  for  the  Gu-Eisenstat  condition 

From  -)  —  ■)  =  ^3j  =  i(^j  ~  ^j  )  '""'  (- ' )  wr  '""' 

|-  _-|  <  rm(„  +  6)c|L4||2. 

We  have  noted  thai  the  DougaiTu-Soren.sen  condition  (1JJ)  is  stringent.  It  is 
natural  to  ask  for  small  absuluh  errors  in  the  .^,.  If  we  replace  6,- *  by  bjtk/Pk  in 
the  analysis  in  section   1.1  we  liud  that 


i  e 


71  -1 


*=*     1+     ^A      =*+     S«" 


J  =  1 


and 


M;.  --4I  <  ^".      |*"|  <  *'(!  +O(0), 


are  implied  by  the  Cu-Eiscnslal  ininliliuii 

A,  -  A, 


o  -A, 


=  'j/.-      I"j,fr|  i 


ft,  /,l  <  ft. 


We  must  bound  ft. 
From  (20) 


/ 


lAjt-Afcl 


i  +  E 


"  - 1  /32 


\        j  =  i  (mj  ~  A'  K"j  -  A/. 
with  ?);  =  3(7/  -f  (i).  Using 


<n„   \\l.-<t\  +  j2 


% 


,  In,-  —  At I 


|A  -  'T;  <  |A  -  A,,|  +  |A/..  -  a\ 
and  the  Gu-Eisenstat  inequality, 

1 1/2 


|(ij  —  A| 


< 


+ 


At  -  Al 


(oj  -  Aji.Mj  -  A,,)  |  (orj  -  A)(c»j  -  At)' 


we  get 


Ciencralizecl  Divide  and  Conquer         17 


|Ai.  -  Ai.l  <  uu 


T"- 


|A,,.  -  ct\  + 


';  =  '    [(a_,-A,  K«rv>  — A*  )]'/- 


\ 


r;: 


(Oj-A*  (((.j-A/,.) 


where  e  has  been  increased  to  (/( I  —  m<  ). 
By  Cauchy's  inequality, 


lAjk-Ajtl     <     mi     \Xk  -a|  + 


, ^ 


1/2 


<       TIM   MA/.-  -  rr|  -+■  ^   ^  -  A/..)(cvj  -A/.-) 


1/2 


for  every  j.  The  arithmetic-geometric  mean  inequality  and  the  triangle  inequality 
yield 


\Xk  -  Ajt|  <  me  (  | Xk  -<r\-\-  ^—±  Moj  -  Ai;|  +  -|A,,  -  A 


Thus 


TIM  ||1.| 


lAt  -Ai.|     < 


A/.  -  rr   +   n,  -  A;..  —— 

H  ■ 


M"v-A, 


+ 


I  A;,  -  <r|     fl 

|A/,  -a;|||b| 


<      2mc||b||^- 


If  me||b||  <  ft  for  all  j,  then 


and  consequently 


l<V/.|  <  4»im||1»| 


|/4  -  th-\  <  <'"l"  +  •>)<  ||b||. 

Thus  tolji  is  7/u .  If  >jk  <  3(tj  -j-  (i)(|j/'!|  \w  replace  rfk  by  '"'t'0  and  accept  q^  as 
an  eigenvalue  with  normalized  eigenvector  o.k . 

The  computed  eigenvectors  of  j4  are  iak<  n  lo  be  those  of  the  nearby  matrix  A. 
Because  of  (13)  ami  (1G)  they  are  computed  to  high  relative  precision  elementwise 
and  hence  are  numerically  orthogonal  ['201 . 
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